PlantTFDB
Plant Transcription Factor Database
v4.0
Previous version: v3.0
Transcription Factor Information
Basic Information
TF ID Thecc1EG030445t1
Common Name TCM_030445
Organism Theobroma cacao
Taxonomic ID
Taxonomic Lineage
cellular organisms; Eukaryota; Viridiplantae; Streptophyta; Streptophytina; Embryophyta; Tracheophyta; Euphyllophyta; Spermatophyta; Magnoliophyta; Mesangiospermae; eudicotyledons; Gunneridae; Pentapetalae; rosids; malvids; Malvales; Malvaceae; Byttnerioideae; Theobroma
Family Trihelix
Protein Properties Length: 419 aa    MW: 47208 Da    pI: 5.4124
Description Trihelix family protein
Gene Model
Gene Model ID     Type    Source  Coding Sequence
Thecc1EG030445t1  genome  CGD     View CDS
Signature Domain
No.  Domain    Score  E-value  Start  End  HMM Start  HMM End
1    trihelix  78.7   8.5e-25  295    379  1          86
          trihelix   1 rWtkqevlaLiearremeerlrrgklkkplWeevskkmrergferspkqCkekwenlnkrykkikegekkrtsessstcpyfdqle 86 
                       rW++ ev+aLi +r+ +e+++r + +k   W+e+s  m   g+ rs+k+Ckekwen+nk+++k   + kk+  e+s++c+yf++l+
  Thecc1EG030445t1 295 RWPDAEVQALIMLRSALEHKFRVTGSKCSIWDEISVGMYNMGYCRSAKKCKEKWENINKYFRKSMGSGKKH-LENSKRCAYFHELD 379
                       8********************************************************************99.577789******97 PP
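
The coordinates in the Signature Domain table can be cross-checked against the alignment itself. The following is a minimal sketch, assuming the Start/End and HMM Start/HMM End values are 1-based inclusive positions and that "-" in the query row marks a gap; the variable and function names are illustrative and not part of PlantTFDB.

```python
# Query row of the alignment above, copied verbatim ("-" is assumed to mark a gap).
QUERY_ALIGNED = (
    "RWPDAEVQALIMLRSALEHKFRVTGSKCSIWDEISVGMYNMGYCRSAKKCKEKWENINKYFRKSMG"
    "SGKKH-LENSKRCAYFHELD"
)

# Coordinates taken from the Signature Domain table.
SEQ_START, SEQ_END = 295, 379
HMM_START, HMM_END = 1, 86


def span(start: int, end: int) -> int:
    """Length of an interval, assuming 1-based inclusive coordinates."""
    return end - start + 1


# Residues of the protein covered by the domain hit (gap characters removed).
residues = QUERY_ALIGNED.replace("-", "")
gap_count = QUERY_ALIGNED.count("-")

# The ungapped query should match the sequence span, and the number of
# alignment columns should match the HMM span (the HMM row has no gaps here).
print(len(residues), span(SEQ_START, SEQ_END))
print(len(QUERY_ALIGNED), span(HMM_START, HMM_END))
print("gaps in query:", gap_count)
```

If the two numbers on each printed line agree, the table row and the alignment describe the same hit.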

Protein Features
3D Structure
Database         Entry ID  E-value   Start  End  InterPro ID  Description
PROSITE profile  PS50090   6.458     288    352  IPR017877    Myb-like domain
CDD              cd12203   3.59E-25  294    358  No hit       No description
Pfam             PF13837   1.3E-17   294    380  No hit       No description
Gene Ontology
GO Term     GO Category         GO Description
GO:0003677  Molecular Function  DNA binding
Sequence
Protein Sequence    Length: 419 aa
MELFNGGRET FPHHVAPFPD LTAIGMIESA EDSMMGDHRP NLPPQKLRPI RYNGRSPASS  60
QAEDTSEFAE VVELVGDEVC PVNGDSGEYL EPPVKAEVGD VVDTGGGDGP PNSEHGGDSS  120
SSSSSDSDDN DMSTTLNEPL NRKRKRKKSK KIELFLEKLV MKVMEKQELM HKQLIETIEK  180
RERERIIREE AWKQQEMERI KRDEEARAQE TSRSIALISF IKNVLGHDIE IPVQSTISCM  240
EETGGKEMSE GHIQKDMISL CDPINRWQEG KMQANGGENH VHEDIGINCD PSNRRWPDAE  300
VQALIMLRSA LEHKFRVTGS KCSIWDEISV GMYNMGYCRS AKKCKEKWEN INKYFRKSMG  360
SGKKHLENSK RCAYFHELDM LYKNGLVSPA NHVNWTKDEN EDRGELTPKA GSENVIGA*
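
The sequence above is displayed in groups of ten residues with a running position counter at the end of each full line. A minimal sketch of turning that display back into a plain residue string follows; it assumes the trailing "*" is a stop symbol rather than a residue, and the function name `clean_sequence` is hypothetical. Only the first and last display lines are used to keep the example short.

```python
import re

# First and last lines of the sequence display above, copied verbatim.
DISPLAY_LINES = [
    "MELFNGGRET FPHHVAPFPD LTAIGMIESA EDSMMGDHRP NLPPQKLRPI RYNGRSPASS  60",
    "SGKKHLENSK RCAYFHELDM LYKNGLVSPA NHVNWTKDEN EDRGELTPKA GSENVIGA*",
]


def clean_sequence(lines):
    """Return (residues, has_stop) for a list of display lines.

    Grouping spaces and position counters are dropped. A trailing "*" is
    treated as a stop symbol (an assumption) and reported separately.
    """
    joined = "".join(lines)
    letters = re.sub(r"[\s\d]", "", joined)  # remove whitespace and digits
    has_stop = letters.endswith("*")
    return letters.rstrip("*"), has_stop


residues, has_stop = clean_sequence(DISPLAY_LINES)
print(len(residues), has_stop)
```

Applied to all seven display lines, the same function yields the full sequence as a single string suitable for FASTA output or a BLAST query.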
Nuclear Localization Signal
NLS
No.  Start  End  Sequence
1    140    147  NRKRKRKK
2    141    146  RKRKRK
3    141    147  RKRKRKK
4    142    150  KRKRKKSKK
5    143    147  RKRKK
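
The NLS coordinates can be checked against the protein sequence. In this record the listed motifs line up with the sequence when Start and End are read as 0-based inclusive indices, unlike the 1-based positions used for the signature domain. This is an observation from this entry, not documented behaviour, and the sketch below only tests it on residues 121-180 (the third sequence line); all names in the code are illustrative.

```python
# Residues 121-180 of the protein sequence (third display line),
# with grouping spaces and the position counter removed.
SEGMENT = "SSSSSDSDDNDMSTTLNEPLNRKRKRKKSKKIELFLEKLVMKVMEKQELMHKQLIETIEK"
SEGMENT_START = 121  # 1-based position of the first residue in SEGMENT

# (start, end, sequence) rows from the NLS table.
NLS_ROWS = [
    (140, 147, "NRKRKRKK"),
    (141, 146, "RKRKRK"),
    (141, 147, "RKRKRKK"),
    (142, 150, "KRKRKKSKK"),
    (143, 147, "RKRKK"),
]


def slice_zero_based(start: int, end: int) -> str:
    """Extract residues assuming 0-based inclusive coordinates."""
    offset = SEGMENT_START - 1  # 0-based index of SEGMENT[0] in the protein
    return SEGMENT[start - offset : end - offset + 1]


def slice_one_based(start: int, end: int) -> str:
    """Extract residues assuming 1-based inclusive coordinates."""
    return SEGMENT[start - SEGMENT_START : end - SEGMENT_START + 1]


for start, end, motif in NLS_ROWS:
    zero_ok = slice_zero_based(start, end) == motif
    one_ok = slice_one_based(start, end) == motif
    print(start, end, motif, "0-based:", zero_ok, "1-based:", one_ok)
```

Checking the indexing convention this way before extracting motifs avoids off-by-one errors when mixing NLS and domain coordinates.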
Binding Motif
Motif ID  Method  Source                   Motif file
MP00552   DAP     Transfer from AT5G47660  Download
Annotation -- Protein
Source  Hit ID          E-value  Description
Refseq  XP_007026375.1  0.0      Duplicated homeodomain-like superfamily protein, putative isoform 1
TrEMBL  A0A061GIL1      0.0      A0A061GIL1_THECC; Duplicated homeodomain-like superfamily protein, putative isoform 1
Orthologous Group
Lineage  Orthologous Group ID  Taxa Number  Gene Number
MalvidsOGEM94112736
Best hit in Arabidopsis thaliana
Hit ID       E-value  Description
AT5G47660.1  6e-44    Trihelix family protein
Publications
  1. Motamayor JC, et al.
    The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color.
    Genome Biol., 2013. 14(6): p. r53
    [PMID:23731509]